## -------------------------------------------------- #
## Frank_2021_II_README.txt
## -------------------------------------------------- #

 Date: July 27, 2021

 Author: Richard W. Frank 

 Title: Human Trafficking Indicators: A New Dataset

 Journal: International Interactions

 Contact Information: Richard Frank <richard.frank @ anu.edu.au>

  
# ---------------------------------------------------------------------------------------------------- #
# file and folder descriptions
# ---------------------------------------------------------------------------------------------------- #

## A bit of background ##		
  
  - These do files were created and run using Stata 14 - 16.1 on a 2016 MacBook Pro. 
  - Most files require changing the working directory to the location of the downloaded replication data. Replace the "~" in each file with the file path of the working directory where you placed these replication files.
  - Most of the EBA models were run in AWS on several parallel EC2 instances to save computing time. The files that run EBA models are broken up into fourteen do files. This was done so that models could be run simultaneously across multiple AWS instances.
  - The IRT models were run in R on the 2016 MacBook Pro computer.
  - The data files generated by the EBA and IRT models are currently (as a whole) over 41GB in size, so I did not think it practical to upload them all to Dataverse. The do files below should generate results files identical to mine. If you are interested in a particular results data file, send me an email, and I can send you a link to it.
  - As I am but human, you may come across mistakes in my code or data. If you do, I would really appreciate it if you could send me an email and let me know. I plan on updating the HTI data yearly, and I will include a list of any corrections made from the previous iteration of the HTI data.
  - Thanks for your interest in this research, and I hope you find these replication files useful!
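
  A minimal sketch of the working-directory step described above. The folder path is a placeholder (an assumption, not the actual location on your machine), and the example assumes HTI_v1.dta sits directly in that folder; adjust the path if you keep it in a subfolder.

```stata
* Hypothetical path: replace with the folder where you saved the replication files.
cd "~/Downloads/HTI_replication"

* Load the main HTI data (file name as listed in the Data section below).
use "HTI_v1.dta", clear

* Quick check that the data loaded: number of observations and variables.
describe, short
```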



  ########
  # Data #
  ########

## The main files of interest.

     HTI_v1.dta -- These are the main 2001-2017 HTI data used in the figures, tables, and appendix.

     Table_3_merging_data.do -- This file merges the various datasets of independent variables used in the EBA analysis. Given the number of sources used and the effort their authors put into generating these datasets, I do not include the finished merged data. Rather, I give links to the websites where you can download these data. This helps the dataset maintainers keep track of how many people are actually using their data.

     HTI_v1_errata.do -- As I cleaned the files below for publication, I came across a few mistakes or omissions in the v1 data. Sed utique. This file shows what final corrections were made to the data.

## Most of the additional files in this folder help create the merged dataset used in the EBA analysis. The sources for these data are listed in the Table_3_merging_data.do file.

     ccode_to_name.do -- This file creates consistent country names using Correlates of War (COW) country codes.
     cleaning_3P.do -- This file cleans Prof. Cho's 3P Index data.
     cleaning_Alesina_et_al_2003.do -- This file cleans Alesina et al.'s (2003) replication data.
     cleaning_Cho_et_al_2013.do -- This file cleans Cho et al.'s (2013) replication data.
     cleaning_CI_rights.do -- This file cleans CIRI and CI-rights data.
     cleaning_CIA.do -- This file cleans CIA World Factbook data.
     cleaning_cpi.do -- This file cleans CPI data.
     cleaning_dpi.do -- This file cleans DPI data.
     cleaning_fh.do -- This file cleans Freedom House data.
     cleaning_fh_media.do -- This file cleans Freedom House's media freedom data.
     cleaning_ICRG.do -- This file cleans ICRG data.
     cleaning_IPI.do -- This file cleans IPI peacekeeping data.
     cleaning_La_Porta_et_al_1998.do -- This file cleans La Porta et al. (1998) replication data.
     cleaning_Neumayer_2006.do -- This file cleans Neumayer (2006) replication data.
     cleaning_polity.do -- This file cleans Polity IV data.
     cleaning_UNODC.do -- This file cleans UNODC's recent cross-sectional-time-series spreadsheets.
     cleaning_UNODC_2006.do -- This file cleans UNODC cross-sectional data from their 2006 report on flows' intensity.
     cleaning_wgi.do -- This file cleans the WGI data.
     cow.do -- This file creates country codes.
     creating_WB_region_codes.do -- This file creates regional dummies using World Bank definitions.
     oecd.do -- This file creates a variable coding OECD membership.
 

  ##################
  # Figures folder #
  ##################
 
     Figure_1.do -- This file creates Figure 1.
     Figure_2.do -- This file creates Figure 2. 
     Figure_3.do -- This file creates Figure 3. 

  ################
  # Table folder #
  ################

## Table_3_models folder ## 

  - This folder includes four subfolders, each including the do files necessary to run a series of EBA models on a particular dependent variable. Given the number of different models run, these runs are broken into three or four different do files per dependent variable. Do let me know if you have any questions about how these models were put together or run.

  - These files rely on two add-ons: tuples and regsave. If you do not have these installed, type "findit tuples" or "findit regsave" in the Stata command line.

  - These models took in total about 540 hours of computer time to run. The 59 UNODC models each took about 3-3.5 hours to run, and the HTI dependent variable models each took about two hours to run.

  - Conditional statements (e.g., "cond(!1&2&3)") are included in some do files. These statements exclude some combinations of variables. They were added during the model estimation process because some models failed to converge when those combinations were included.
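
  A rough sketch of how the two add-ons fit together, not the code in the EBA do files themselves. It uses Stata's built-in auto data as a stand-in for the merged EBA data, and the variables, the conditional statement, and the "model" label are illustrative assumptions. The conditionals() syntax shown reflects my reading of the tuples help file; check "help tuples" for the version you have installed.

```stata
* One-time installation of the two user-written packages from SSC.
ssc install tuples, replace
ssc install regsave, replace

* Illustration only: the built-in auto data stand in for the merged EBA data.
sysuse auto, clear

* Build every combination of three candidate regressors, dropping any
* combination that contains both the 1st and 2nd variable in the list.
tuples headroom trunk length, conditionals(!(1&2))

* Estimate one logit per remaining combination and stack the estimates
* in a temporary results file (created on the first pass, appended after).
tempfile results
forvalues i = 1/`ntuples' {
    quietly logit foreign `tuple`i''
    if `i' == 1 {
        regsave using "`results'", replace addlabel(model, `i')
    }
    else {
        regsave using "`results'", append addlabel(model, `i')
    }
}

* Inspect the stacked coefficients and standard errors.
use "`results'", clear
list model var coef stderr in 1/5
```

  Splitting the list of tuples across several do files, as done in the folders below, lets each block of models run on a separate machine.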


 ## UNODC folder -- This folder includes EBA ologit models using the cross-sectional UNODC (2006) destination variable.

     EBA_UNODC_1-5.do -- This file runs the first 5 series of ordered logit models using the UNODC's destination variable.
     EBA_UNODC_6-30.do -- This file runs the 6th to 30th series of ordered logit models using the UNODC's destination variable.
     EBA_UNODC_31-59.do -- This file runs the 31st to 59th series of ordered logit models using the UNODC's destination variable.

 ## HTI dest folder -- This folder includes EBA logit models using the HTI destination variable.

     EBA_dest_1-5.do -- This file runs the first 5 series of logit models using the HTI's destination dependent variable.
     EBA_dest_6-30.do -- This file runs the 6th to 30th series of logit models using the HTI's destination dependent variable.
     EBA_dest_31-58.do -- This file runs the 31st to 58th series of logit models using the HTI's destination dependent variable.

 ## HTI pdest folder -- This folder includes EBA logit models using the HTI forced sexual exploitation destination variable.

     EBA_pdest_1-15.do -- This file runs the first 15 series of logit models using the HTI's sex trafficking destination dependent variable.
     EBA_pdest_16-30.do -- This file runs the 16th to 30th series of logit models using the HTI's sex trafficking destination dependent variable.
     EBA_pdest_31-45.do -- This file runs the 31st to 45th series of logit models using the HTI's sex trafficking destination dependent variable.
     EBA_pdest_46-58.do -- This file runs the 46th to 58th series of logit models using the HTI's sex trafficking destination dependent variable.

 ## HTI ldest folder -- This folder includes EBA logit models using the HTI forced labor destination variable.

     EBA_ldest_1-15.do -- This file runs the first 15 series of logit models using the HTI's labor trafficking destination dependent variable.
     EBA_ldest_16-30.do -- This file runs the 16th to 30th series of logit models using the HTI's labor trafficking destination dependent variable.
     EBA_ldest_31-45.do -- This file runs the 31st to 45th series of logit models using the HTI's labor trafficking destination dependent variable.
     EBA_ldest_46-58.do -- This file runs the 46th to 58th series of logit models using the HTI's labor trafficking destination dependent variable.

 ## Table_3_results folder -- This folder includes files that summarize the results created by the do files in the model folders.

     1_UNODC_EBA_results.do -- This file generates summary statistics for the cross-sectional UNODC models that are saved in the spreadsheets below.  
     2_dest_EBA_results.do -- This file generates summary statistics for the HTI destination models that are saved in the spreadsheets below.  
     3_pdest_EBA_results.do -- This file generates summary statistics for the HTI sex trafficking destination models that are saved in the spreadsheets below.  
     4_ldest_EBA_results.do -- This file generates summary statistics for the HTI labor trafficking destination models that are saved in the spreadsheets below.
  
     5_Table_3_results_summarized.do -- This file merges the four spreadsheets below, determines which variables are robust, and summarizes the relevant variables for inclusion in Table 3. In Appendix E, the Table_E3.do file also merges these spreadsheets and generates the summary information for all EBA model variables.

     UNODC_2006_EBA_results.xlsx -- Excel file with summary statistics (means, s.d., %sign, CDF>0) created by 1_UNODC_EBA_results.do.
     HTI_dest_EBA_results.xlsx -- Excel file with summary statistics (means, s.d., %sign, CDF>0) created by 2_dest_EBA_results.do.
     HTI_pdest_EBA_results.xlsx -- Excel file with summary statistics (means, s.d., %sign, CDF>0) created by 3_pdest_EBA_results.do.
     HTI_ldest_EBA_results.xlsx -- Excel file with summary statistics (means, s.d., %sign, CDF>0) created by 4_ldest_EBA_results.do.
 
  ###################
  # Appendix folder #
  ###################

## Appendix A folder ##

     Figure_A1.do -- This file creates Figure A1.
     Figure_A1.xlsx -- The data used to create Figure A1.

## Appendix B folder ##

     Table_B1.do -- This file generates the info used in Table B1.
     Table_B2.do -- This file generates the info used in Table B2.
     Table_B3.do -- This file generates the info used in Table B3.

## Appendix C folder ##

     Figures_C1-C12.do -- This file generates the maps in Figures C1-C12.
     Figure_C13.do -- This file generates Figure C13.
     Figure_C14.do -- This file generates Figure C14.
     Figure_C15.do -- This file generates Figure C15.
     Figure_C16.do -- This file generates Figure C16.
     Figure_C17.do -- This file generates Figure C17.

## Appendix E folder ##

     Figure_E1.do -- This file generates Figure E1.
     Figure_E2.do -- This file generates Figure E2.
     Table_E1.do -- This file generates the info used in Table E1.
     Table_E2.do -- This file generates the info used in Table E2.
     Table_E3.do -- This file generates the info used in Table E3.
     Table_E3.dta -- This file is the EBA results summary data used for Table E3.

## Appendix F folder ##

     Figure_F1.do -- This file generates Figure F1.
     Figure_F2.do -- This file generates Figure F2.
     Figure_F2.xlsx -- These data are used to generate Figure F2.
     Figure_F3.do -- This file generates Figure F3.

## Appendix G folder ##
 
  ## Analysis subfolder -- This folder includes the files used to actually run the IRT models and analyze the results. The model runs (especially the dynamic model) can take several hours.

     1_cleaning_for_irt.do -- This do file cleans the HTI data in Stata before they are read into R.
     2_stan_data_prep.R -- This file gets the data ready for Stan in R.
     3_static_models.R -- This file runs the static IRT model.
     4_dynamic_models.R -- This file runs the dynamic IRT model.
     5_static_analysis.R -- This file analyses the static results.
     6_dynamic analysis.R -- This file analyses the dynamic results.
     Figure G1.do -- This file creates Figure G1.
     Figure_G6.do -- This file creates Figure G6.
     HTI_stan_data_prepped.RData -- These are the merged data used in all IRT models.
     M2_dynamic.stan -- This is the Stan model file called by the 4_dynamic_models.R file.
     Table G3.do -- This file generates the info used in Table G3.

  ## Latent estimates folder -- This folder includes two subfolders, one for static and one for dynamic alphas, betas, and thetas. 
 

## -------------------------------------------------- #
## end of file
## -------------------------------------------------- #
















